CIND820 Project Course Code

Ryan Boyd - 501072988

Assignment 3

Initial Results and Code

Data is loaded and prepared

The full original dataset is too large to work with, so it is clipped to a smaller study area by querying the dataframe against the study area's longitude and latitude boundary. The pre-classified ground points (class 2) are removed, since ground is not a concern at this stage. Only the unclassified points (class 1) are kept for analysis.
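The clipping and class filter described above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual code: the column names `X`, `Y`, `Z` and `classification`, the helper `clip_and_filter`, and the sample coordinates are all assumptions made for the example.

```python
import pandas as pd


def clip_and_filter(points, lon_min, lon_max, lat_min, lat_max, keep_class=1):
    """Keep points inside the bounding box that carry the given class code.

    Column names ("X", "Y", "classification") are assumed for illustration.
    """
    in_box = (
        points["X"].between(lon_min, lon_max)
        & points["Y"].between(lat_min, lat_max)
    )
    is_wanted_class = points["classification"] == keep_class
    return points.loc[in_box & is_wanted_class].reset_index(drop=True)


# Tiny made-up sample: one point outside the box, one ground point (class 2).
sample = pd.DataFrame({
    "X": [-123.12, -123.11, -123.30, -123.115],
    "Y": [49.28, 49.285, 49.28, 49.29],
    "Z": [12.0, 30.5, 8.0, 3.1],
    "classification": [1, 1, 1, 2],
})

clipped = clip_and_filter(sample, -123.13, -123.10, 49.27, 49.30)
print(clipped)
```

Filtering with a boolean mask keeps the operation vectorized, which matters when the source point cloud has millions of rows.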

Data normalized and preprocessed for analysis

Add Imagery Data

2015 imagery was obtained from the City of Vancouver to extract the RGB values.

The image was clipped in QGIS (an open-source mapping program) to the same area of interest as above.

The selected image size is 4084x4084 pixels. The lidar coordinates are scaled by 4084 so that each point maps to its nearest pixel value (R, G, B) in the image.

The nearest R,G,B pixel value from the image is extracted for each lidar point and the results are saved as a field in the data frame
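One way to do this nearest-pixel lookup is sketched below with NumPy. It assumes the image is already loaded as a `(rows, cols, 3)` array, that its footprint matches the study area bounds, and that row 0 is the top (northern) edge of the image. The helper name `sample_rgb` and the tiny 4x4 test image are hypothetical stand-ins for the 4084x4084 orthophoto.

```python
import numpy as np


def sample_rgb(x, y, image, bounds):
    """Return the RGB value of the pixel nearest to each (x, y) point.

    image  : array of shape (rows, cols, 3)
    bounds : (x_min, y_min, x_max, y_max) of the image footprint (assumed
             to be the same extent as the clipped lidar data)
    """
    x_min, y_min, x_max, y_max = bounds
    rows, cols = image.shape[:2]

    # Fractional position of each point inside the footprint, in [0, 1].
    frac_x = (np.asarray(x, dtype=float) - x_min) / (x_max - x_min)
    frac_y = (np.asarray(y, dtype=float) - y_min) / (y_max - y_min)

    # Scale to pixel indices; the row axis is flipped because image row 0
    # is assumed to be the top of the image (largest y).
    col = np.clip((frac_x * cols).astype(int), 0, cols - 1)
    row = np.clip(((1.0 - frac_y) * rows).astype(int), 0, rows - 1)
    return image[row, col]


# Small synthetic image so the lookup is easy to check by hand.
image = np.arange(4 * 4 * 3).reshape(4, 4, 3)
rgb = sample_rgb([0.5, 3.5], [3.5, 0.5], image, bounds=(0.0, 0.0, 4.0, 4.0))
print(rgb)
```

The returned array has one row per point, so its three columns can be assigned directly to `R`, `G` and `B` fields of the dataframe.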

The R, G, B values are normalized in the same way as the other variables.
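The text does not state which normalization is used, so the sketch below assumes simple min-max scaling to the range 0 to 1. The helper name `min_max_normalize` and the sample values are illustrative only.

```python
import pandas as pd


def min_max_normalize(frame, columns):
    """Return a copy of frame with the given columns rescaled to [0, 1].

    Min-max scaling is an assumption; a constant column is mapped to 0.
    """
    out = frame.copy()
    for name in columns:
        low, high = out[name].min(), out[name].max()
        span = high - low
        out[name] = 0.0 if span == 0 else (out[name] - low) / span
    return out


colours = pd.DataFrame({
    "R": [0, 128, 255],
    "G": [10, 10, 10],
    "B": [50, 100, 150],
})
scaled = min_max_normalize(colours, ["R", "G", "B"])
print(scaled)
```

Putting every variable on the same 0 to 1 range keeps any single feature from dominating the distance calculations used later in clustering.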

Dataset statistics and information - exploratory

Additional testing and improvements

Initial Classification (unsupervised) using kmeans clustering

Attempt to classify points into undetermined classes based on the data

Variables: Height, Intensity, R, G, B
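A minimal sketch of this step with scikit-learn's `KMeans` is shown below. The feature matrix is synthetic stand-in data with five columns in the order Height, Intensity, R, G, B, and `n_clusters=2` is a placeholder, since the value of K used in the project is not stated here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic stand-in for the normalized features
# (columns: Height, Intensity, R, G, B). Two well-separated groups.
low_group = rng.normal(0.2, 0.02, size=(50, 5))
high_group = rng.normal(0.8, 0.02, size=(50, 5))
features = np.vstack([low_group, high_group])

# n_clusters is a placeholder value, not the K used in the project.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(features)

print(np.bincount(labels))      # number of points per cluster
print(model.cluster_centers_)   # one centre per cluster, in feature space
```

The resulting `labels` array can be stored as a new column of the dataframe so that each lidar point carries its cluster label into the later steps.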

Visualization

Spatial/Distance Clustering

This section attempts to use the previous classification label to cluster the points into distinct local objects.
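One possible way to do this is to run a density-based clustering separately within each class label, so that points of the same class that are far apart become different objects. The sketch below assumes scikit-learn's `DBSCAN`; the helper `cluster_objects`, the `eps` and `min_samples` values, and the synthetic coordinates are illustrative assumptions rather than the project's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def cluster_objects(xyz, class_labels, eps=1.0, min_samples=5):
    """Run DBSCAN inside each class and return one object id per point.

    Points that DBSCAN marks as noise keep the id -1.
    """
    object_ids = np.full(len(xyz), -1, dtype=int)
    next_id = 0
    for label in np.unique(class_labels):
        idx = np.flatnonzero(class_labels == label)
        local = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(xyz[idx])
        found = local >= 0
        # Offset local cluster ids so they stay unique across classes.
        object_ids[idx[found]] = local[found] + next_id
        if found.any():
            next_id += int(local.max()) + 1
    return object_ids


rng = np.random.default_rng(1)
blob_a = rng.normal([0.0, 0.0, 0.0], 0.1, size=(30, 3))
blob_b = rng.normal([20.0, 0.0, 0.0], 0.1, size=(30, 3))
blob_c = rng.normal([0.0, 20.0, 0.0], 0.1, size=(30, 3))
xyz = np.vstack([blob_a, blob_b, blob_c])
classes = np.array([0] * 60 + [1] * 30)  # blobs a and b share a class

ids = cluster_objects(xyz, classes, eps=1.0, min_samples=5)
print(np.unique(ids))
```

Here the two blobs that share a class label are still split into separate objects because they are spatially disconnected.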

Visualization

Testing Alternative models

Analysis of results

Elevation has a large effect on the classification, and the R, G, B values from the imagery make shadows very apparent in the results. I may need to adjust the weighting of these values to mitigate these issues. I will also need to look into neighbourhood characteristics of groups of several points, such as flat surfaces or planes and sharp edges. The research papers I read cover this, and although it appears somewhat technically intensive for my purposes, I will investigate.

Next Steps:
- Optimize the value of K for K-means (elbow method or similar)
- Update the Z values to be a height relative to the ground, as opposed to an absolute (mean sea level) height
- Optimize the parameters to improve results (eps, min_samples, leaf_size, etc.)
- Switch the orthophoto for one without shadows
- Find a method to extract clusters as features
- Create data for known features to compare against
- Find a method to compare results to known features
- Determine the evaluation criteria and minimum viable product
- Evaluate the performance of the models and the project results overall
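As a starting point for the first of these steps, the elbow method can be sketched by fitting K-means for a range of K values and recording the inertia (within-cluster sum of squares) of each fit. The helper `inertia_by_k` and the three-blob test data are assumptions for illustration; on the real features the elbow may be far less distinct.

```python
import numpy as np
from sklearn.cluster import KMeans


def inertia_by_k(features, k_values, random_state=0):
    """Fit K-means for each K and return {K: inertia}."""
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=random_state)
        .fit(features)
        .inertia_
        for k in k_values
    }


rng = np.random.default_rng(2)
# Synthetic data with three obvious groups, so the elbow should sit at K=3.
blobs = np.vstack([
    rng.normal(centre, 0.1, size=(40, 2)) for centre in (0.0, 5.0, 10.0)
])

curve = inertia_by_k(blobs, range(1, 7))
for k, inertia in curve.items():
    print(k, round(inertia, 2))
```

Plotting inertia against K and looking for the point where the curve flattens gives a candidate K; in this synthetic case the drop from K=2 to K=3 is much larger than any drop after it.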